-
-
Notifications
You must be signed in to change notification settings - Fork 54
feat(client): ✨ implement cross-provider model fallback system #91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
This introduces a robust mechanism to define and utilize groups of equivalent models across different providers. If a request is made to a model that belongs to a fallback group, the client can now seamlessly rotate through other providers/models in that group upon exhaustion of credentials or errors. - Add `FallbackGroupManager` to parse and index fallback configurations from init args or the `MODEL_FALLBACK_GROUPS` environment variable. - Implement `_execute_with_fallback` in `RotatingClient` for standard completions, handling credential swapping and tier prioritization across providers. - Implement `_streaming_with_fallback` to support transparent failover during streaming responses. - Integrate fallback logic into the main `acompletion` entry point.
|
I'm diving into the new cross-provider model fallback system. Looking forward to reviewing the and how it integrates into the . Back soon with the full report! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing review submission with only critical findings.
src/rotator_library/client.py
Outdated
|
|
||
| # Temporarily swap credentials for this provider | ||
| original_creds = self.all_credentials.get(provider, []) | ||
| self.all_credentials[provider] = tier_creds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CRITICAL: This is a major concurrency race condition. self.all_credentials is shared across all requests. Swapping it globally will affect every other concurrent request to this provider, causing them to use only the credentials from this fallback tier.
To fix this properly, _execute_with_retry should be updated to accept an optional credentials_override list, avoiding global state modification during request execution.
src/rotator_library/client.py
Outdated
|
|
||
| # Temporarily swap credentials for this provider | ||
| original_creds = self.all_credentials.get(provider, []) | ||
| self.all_credentials[provider] = tier_creds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CRITICAL: Same race condition as in _execute_with_fallback. Global modification of self.all_credentials is not safe for concurrent streaming requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing review submission for fallback_groups.py.
| group = self._groups[group_idx] | ||
|
|
||
| # Move target to front, preserve relative order of others | ||
| reordered = [provider_model] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider deduplicating the group members here. If an entry appears multiple times in the group config, it will currently appear multiple times in the returned list (except for the promoted target).
| reordered = [provider_model] | |
| # Move target to front, preserve relative order of others, and deduplicate | |
| reordered = [provider_model] | |
| seen = {provider_model} | |
| for entry in group: | |
| if entry not in seen: | |
| reordered.append(entry) | |
| seen.add(entry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall Assessment
This PR introduces a powerful cross-provider model fallback system, which is a great addition for increasing the reliability of the client. The separation of the fallback logic into FallbackGroupManager is clean and well-structured.
However, I've identified a critical concurrency issue in how credentials are swapped during the fallback process (see my previous inline comments on lines 2062 and 2221 in client.py). Since self.all_credentials is a shared instance variable, modifying it during a request will interfere with any other concurrent requests being handled by the same RotatingClient instance. This needs to be addressed before the PR can be merged.
Architectural Feedback
- The tier-aware fallback logic correctly respects provider priorities while allowing cross-provider rotation.
- Critical Concern: The reliance on globally swapping
self.all_credentialsmust be refactored to use local state/arguments for the retry logic to ensure thread/async safety.
Key Suggestions
- Refactor
_execute_with_retryand_streaming_acompletion_with_retryto accept an explicitcredentialslist instead of reading fromself.all_credentials. - Improve the robustness of
FallbackGroupManagerby deduplicating and normalizing config entries (see suggestion on line 137). - Consider if
aembeddingshould also benefit from these fallback groups for consistency. - In
_streaming_with_fallback, consider if we should try other fallback entries if the first one returns only a[DONE]signal without any actual data.
Questions for the Author
- Was the exclusion of
aembeddingfrom the fallback system intentional? - For streaming, do you think we should try other fallback entries if the first one returns only a
[DONE]signal without any actual data?
This review was generated by an AI assistant.
- Update `.env.example` with `MODEL_FALLBACK_GROUPS` configuration template and detailed usage comments. - Add comprehensive section to `DOCUMENTATION.md` explaining the fallback algorithm, tier-first ordering, and target promotion logic. - Update project and library `README.md` files to highlight the new feature and provide updated `RotatingClient` instantiation examples.
Updates retry mechanisms to accept an optional `credentials_for_provider` argument. This eliminates the need to temporarily mutate `self.all_credentials` during fallback attempts, improving thread safety and code clarity. Also in this commit: - fix(core): deduplicate entries when generating fallback groups
Refactor the cross-provider fallback system to use a sequential provider approach instead of global tier-based ordering. - Iterate through fallback entries sequentially, allowing each provider to complete its internal rotation (Tier 2 then Tier 1) before moving to the next. - Remove `credentials_for_provider` injection logic, simplifying retry methods and ensuring provider-specific settings are respected. - Update documentation and examples to reflect the new deterministic behavior. - Adjust logging levels for better clarity during fallback operations.
This introduces a robust mechanism to define and utilize groups of equivalent models across different providers. If a request is made to a model that belongs to a fallback group, the client can now seamlessly rotate through other providers/models in that group upon exhaustion of credentials or errors.
FallbackGroupManagerto parse and index fallback configurations from init args or theMODEL_FALLBACK_GROUPSenvironment variable._execute_with_fallbackinRotatingClientfor standard completions, handling credential swapping and tier prioritization across providers._streaming_with_fallbackto support transparent failover during streaming responses.acompletionentry point.Related #85
Important
Introduces a cross-provider model fallback system in
RotatingClient, enabling seamless credential rotation across providers for equivalent models.RotatingClient.FallbackGroupManagerto manage fallback configurations from init args orMODEL_FALLBACK_GROUPSenv variable._execute_with_fallbackand_streaming_with_fallbackfor handling fallback logic inclient.py.model_fallback_groupsparameter orMODEL_FALLBACK_GROUPSenv variable..env.examplewith fallback group configuration examples.DOCUMENTATION.mdandREADME.mdto include details on cross-provider fallback groups.src/rotator_library/README.md.client.pyandfallback_groups.py.This description was created by
for 96c4bdb. You can customize this summary. It will automatically update as commits are pushed.